ABSTRACT

The ALICE program (Archival Legacy Investigation of Circumstellar Environment) is currently conducting
a virtual survey of about 400 stars by re-analyzing the HST-NICMOS coronagraphic archive with advanced
post-processing techniques. We present the strategy that we adopted to identify detections and potential
candidates for follow-up observations, and we give a preliminary overview of our detections. We also present a
statistical analysis conducted to evaluate the confidence level of these detections and the completeness of our
candidate search.

1. INTRODUCTION

The development of advanced post-processing techniques, which use a library of instrument point
spread function (PSF) images to create a synthetic PSF that optimally subtracts the residual starlight from a
target image, has enabled significant progress in the direct imaging of extra-solar planets over the last decade.
Previous PSF subtraction techniques, which mainly consisted of one-to-one image subtraction either of a reference
star or of an image of the science target itself with a different orientation of the field of view, were very efficient
at imaging debris disks in scattered light around young nearby stars, but mostly failed to reach the
contrast limits needed to detect faint exoplanets. Advanced post-processing algorithms based on the linear
combination of PSF images (LOCI and its variants) or on Principal Component Analysis (PCA) can
reach deeper contrast limits with ground-based observations, with PSF diversity obtained with the Angular
Differential Imaging (ADI) or the Spectral Differential Imaging (SDI) observing strategies. They are also
more efficient when using PSFs from many different stars (Reference star Differential Imaging, RDI), even when
acquired with first-generation instruments on the Hubble Space Telescope (HST) and separated by large time
intervals. This has been demonstrated by the re-discovery of the HR 8799 planets in archival NICMOS data from
1998 by Ref. 25 (b planet) and Ref. 17 (b, c, and d planets).

These results made the community realize that a fraction of the previous results obtained with first-generation
coronagraphic instruments might be outdated by the development of these advanced post-processing techniques,
and led to the ALICE project (Archival Legacy Investigations of Circumstellar Environments). The goal of this
program is to consistently reprocess the NICMOS coronagraphic archive with advanced post-processing methods.
NICMOS operated on board HST for about 8 years between 1997 and 2008, and its mid-resolution channel
NIC2 (pixel size 0.076”) was equipped with a 0.3”-radius coronagraphic mask and a Lyot stop. About 400 stars
were observed during the instrument operations, mostly in the two wide-band filters F110W and F160W, as part
of surveys looking for debris disks and planets around nearby stars. The ALICE pipeline assembles and aligns
large PSF libraries from consistent subsamples of this database (acquired with identical filters and in the same
NICMOS era), which are used to process each individual target with the KLIP algorithm. This project has
already revealed new images of 9 debris disks previously undetected in the NICMOS data, among which 8 had
never before been imaged in scattered light (Ref. 27; Choquet et al., in prep.). Many point sources are also uncovered
in the data. This publication reviews the detections obtained so far as part of the ALICE program, and describes
the method used to statistically assess the performance of our candidate search strategy.

2. CANDIDATE SEARCH 

2.1 Candidate search strategy 

Our candidate search strategy is based on the following considerations: 

• High-contrast imaging surveys with ground-based instruments have revealed that giant exoplanets at large
separations from their host star are relatively rare, with occurrence rates varying between 0 and 20%
for planets of several Jupiter masses, depending on the separation range considered, on the atmospheric
models used, and on the contrast limits achieved. In this regime, which is still unprobed by other
detection techniques such as transit detection or radial velocity measurement, every new detection is
critical to further constrain our planet formation models. Our objective is thus to detect all the potential
companions in the archive, and we adopt a conservative strategy which reports candidates even at a medium
confidence level.

• Since all the NICMOS datasets are at least 7 years old, rejection or confirmation of companion candidates
by common proper motion analysis with the new generation of ground-based high-contrast imaging instruments
will be straightforward thanks to their better contrast performance at the separations probed by the
ALICE program (~0.3-3”). While dedicated follow-up observations will be proposed for high-confidence
candidates, having the astrometric and photometric characteristics of low-confidence detections can be
very valuable in the long term to confirm or reject candidates detected with other instruments by different
teams, or to be reported as false positives when contrast limits are derived.

• The James Webb Space Telescope (JWST) will offer coronagraphic capabilities at sensitivities and wavelength
ranges unavailable from the ground, from 2 to 5 μm (NIRCam) and 11 to 23 μm (MIRI). Since it has
a mission lifetime limited to about 5 years, the optimal use of its time for exoplanetary science would be
for the characterization of known companions rather than candidate search and/or confirmation. It is thus critical
to detect as many exoplanets as possible before the JWST launch.

As a consequence, the companion search strategy that we adopted consists of looking for point sources in a
systematic way down to small separations and low contrasts, and reporting all detections even at low signal-to-noise
ratio (SNR). With this strategy, a fraction of the reported detections will be false positives (i.e., speckles
instead of real astrophysical objects). These detections will need to be prioritized by confidence level to select
only the solid candidates that will be proposed for follow-up observations. We describe in Sec. 3.2 a statistical
method to estimate the confidence level and search completeness of a given detection based on its SNR, which can
be used to prioritize the ALICE candidates for further observations.

The majority of the targets in the NICMOS archive have been observed with two different orientations of the
telescope (hereafter called “roll images”), in order to subtract the star PSF by roll subtraction. When available,
we make use of this characteristic to limit the fraction of false positives in our candidate search, by defining a
detection as a point source identified independently in each roll image, with the same astrometry within 1.5 pixels
and the same photometry within a factor of 3. These conservative parameters account for biases induced by the
post-processing that can differ from one roll to the other (e.g., for faint point sources aligned with a diffraction
spike in one roll but not in the other).
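The two-roll criterion above can be sketched as a simple cross-match. This is only an illustration under stated assumptions, not the ALICE pipeline code: the function name `match_rolls`, the `(x, y, flux)` tuple layout, and the nearest-neighbour pairing rule are hypothetical, and both source lists are assumed to be expressed in a common, de-rotated pixel frame with positive fluxes. Only the 1.5-pixel and factor-of-3 tolerances come from the text.

```python
import math


def match_rolls(roll1, roll2, max_sep_pix=1.5, max_flux_ratio=3.0):
    """Pair point sources found independently in two roll images.

    Each source is an (x, y, flux) tuple (hypothetical layout) given in a
    common, de-rotated pixel frame. A pair is kept when the positions agree
    within max_sep_pix and the fluxes agree within a factor max_flux_ratio.
    """
    matches = []
    used = set()  # indices of roll2 sources already paired
    for x1, y1, f1 in roll1:
        best = None
        for j, (x2, y2, f2) in enumerate(roll2):
            if j in used:
                continue
            dist = math.hypot(x1 - x2, y1 - y2)
            ratio = max(f1, f2) / min(f1, f2)
            if dist <= max_sep_pix and ratio <= max_flux_ratio:
                # Keep the closest acceptable counterpart.
                if best is None or dist < best[0]:
                    best = (dist, j)
        if best is not None:
            used.add(best[1])
            matches.append(((x1, y1, f1), roll2[best[1]]))
    return matches


if __name__ == "__main__":
    roll_a = [(10.0, 10.0, 100.0), (30.0, 30.0, 50.0), (50.0, 50.0, 10.0)]
    roll_b = [(10.5, 10.8, 150.0), (30.2, 30.1, 400.0), (55.0, 50.0, 10.0)]
    for pair in match_rolls(roll_a, roll_b):
        print(pair)
```

In this toy input, the second source fails the photometric test (flux ratio of 8) and the third fails the astrometric test (offset of 5 pixels), so only the first pair is retained.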

Figure 1. Distributions in SNR, separation, and contrast (from left to right) of the 290 detections found in
the NICMOS archive to date as part of the ALICE project. They include low-contrast binaries, known background stars,
and companion candidates, as well as a fraction of false positives.

The astrometry of a detection is estimated by finding the maximum of correlation between the image and
a synthetic NICMOS PSF (generated with the Tiny TIM package). The raw photometry is estimated by
measuring the correlation at that position between the image and the normalized synthetic PSF and by correcting
it for the local background level estimated in a ring from 6 to 10 pixels in radius centered on the detection. The
SNR of the detection (as observed in the image, uncorrected for the processing throughput) is estimated from
the raw photometry and from the local noise level per resolution element, measured as the standard deviation in
the same ring of the image convolved with a synthetic NICMOS PSF.
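The SNR estimate described above can be illustrated with a minimal sketch. It makes a simplifying assumption that is not in the text: a single PSF-correlated map (a nested list, hypothetical name `corr_map`) is used for the peak value, the background, and the noise, whereas the paper takes the background from the image and the noise from the PSF-convolved image. The function names are hypothetical; only the 6 to 10 pixel ring comes from the text.

```python
import math
import statistics


def annulus_values(image, x0, y0, r_in=6.0, r_out=10.0):
    """Return the pixel values whose centres lie in the ring r_in <= r <= r_out."""
    values = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(x - x0, y - y0)
            if r_in <= r <= r_out:
                values.append(value)
    return values


def detection_snr(corr_map, x0, y0, r_in=6.0, r_out=10.0):
    """SNR at integer pixel (x0, y0) of a PSF-correlated map (assumed input).

    Signal: map value at the detection minus the mean of the ring.
    Noise: sample standard deviation of the map inside the same ring.
    """
    ring = annulus_values(corr_map, x0, y0, r_in, r_out)
    background = statistics.mean(ring)
    noise = statistics.stdev(ring)
    return (corr_map[y0][x0] - background) / noise


if __name__ == "__main__":
    # Toy map: a deterministic pseudo-noise pattern plus one bright pixel.
    toy = [[((7 * x + 13 * y) % 5) - 2.0 for x in range(21)] for y in range(21)]
    toy[10][10] += 50.0
    print(detection_snr(toy, 10, 10))
```

Because the background is subtracted locally, adding a constant offset to the whole map leaves the estimate unchanged, which is the purpose of the ring correction.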

2.2 Intermediate overview of the detections 

Three large exoplanet surveys were conducted during the NICMOS lifetime (programs 7226, PI E. Becklin; 7227,
PI G. Schneider; 10176, PI I. Song). To date we have re-analyzed 92% of their targets with the final version
of the ALICE pipeline. We report 237 detections in these programs, and 304 detections in total in the NICMOS
coronagraphic archive. Fig. 1 presents the histograms of the SNR, separation, and contrast of the 290 detections
for which the host star has a J or H magnitude reported in the SIMBAD database, which we used to
compute the detection contrast from their photometry in the F110W or F160W filter using the Synphot synthetic
photometry package.
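As a small illustration of how a contrast relates to the magnitude differences quoted later in the text, the sketch below converts a flux ratio into a delta magnitude with the standard relation Δm = -2.5 log10(F_source / F_star). It assumes that both fluxes are already expressed in the same filter and units; the conversion of catalog J or H magnitudes to NICMOS filters with Synphot is not reproduced here, and the function name is hypothetical.

```python
import math


def delta_magnitude(source_flux, star_flux):
    """Magnitude difference between a point source and its host star.

    Both fluxes are assumed to be measured in the same filter and units.
    A fainter source gives a larger (positive) delta magnitude.
    """
    if source_flux <= 0 or star_flux <= 0:
        raise ValueError("fluxes must be positive")
    return -2.5 * math.log10(source_flux / star_flux)


if __name__ == "__main__":
    # A source one million times fainter than its star is about 15 mag fainter.
    print(delta_magnitude(1.0, 1.0e6))
```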

The detections reported here include several categories of objects: binary companions, known background
stars, unknown background stars or companions (real candidates), and false positive detections (i.e., speckles).
About 40 objects are solid detections with SNR greater than 10. A large fraction of the rest of this sample is
detected at low SNR, short separations, and high contrast; these detections are by-products of the candidate
search strategy described in Sec. 2.1.


3. STATISTICAL ANALYSIS OF THE DETECTIONS 

In this section we describe a statistical method to estimate the efficiency of our candidate search strategy, first 
as a whole regardless of the detection characteristics, then specifically for candidates at a given separation and 
contrast. 

To optimize follow-up campaigns, our goal is to classify detections into two categories: real astrophysical
objects (regardless of whether they are background stars, exoplanets, or binary companions) or false positives. The
usual approach to this binary classification is to set a threshold on the SNR of the detections, above which a
detection is considered a real object rather than a speckle. We use here metrics from signal detection theory (first
developed in the 1950s for radar signal detection, now extensively used in medicine to evaluate diagnostic tests) to
characterize our candidate search strategy and determine an optimal SNR detection threshold.

For this analysis, we only consider a consistent subsample of the detections reported in Sec. 2.2, restricted to 
datasets obtained with the F160W filter in the second era of NICMOS (after replacement of its cooling system). 
The histograms in separation and contrast of the detections from this subsample are presented in Fig. 2, top 
row. 

SIMBAD database: http://simbad.u-strasbg.fr/simbad

Synphot: available on the STSDAS website, http://www.stsci.edu/resources/software_hardware/stsdas

Figure 2. Distributions in separation and contrast of the 76 detections found in data from the second era of NICMOS
(after the replacement of its cooling system in 2002) in the F160W filter (top row), and of the 1000 fake candidates
simulated to statistically assess the efficiency of our detection strategy (bottom row). The latter population has been
simulated to have the same distributions in separation and contrast as the former detection list.

3.1 Candidate search strategy global efficiency 

To evaluate our efficiency at discriminating real candidates from false positives in our sample, we perform 1000
realizations of the following two tests:

Test 1 A fake point source is injected in the NICMOS images of a target previously identified as a non-detection,
at a given contrast level and sky position. The dataset is processed with the KLIP algorithm, the images are
de-rotated to a common orientation with North up and combined, and the SNR of the point source is
estimated, as performed with the ALICE pipeline.

Test 2 The same non-detection images are similarly processed, de-rotated, and combined, this time with no fake
point source injected. The SNR of the speckle field is nevertheless measured at the same sky position as in
Test 1.


The population of fake point sources is simulated to have the same distributions in separation and contrast as
the subsample of real ALICE detections presented in Fig. 2 (top row). The separation and contrast histograms of the
fake point sources are shown in the same figure, bottom row. Histograms of the SNR measured for both tests are
reported in Fig. 3. The fake point sources injected in Test 1 are detected with a wide range of SNRs, depending
on their contrast and separation, while the speckles in Test 2 are all detected at low SNR.

Setting an SNR threshold above which a detection is identified as a real candidate rather than a speckle
classifies the fake point sources in Test 1 as true positives (above the threshold) or false negatives (below it), and
the speckles in Test 2 as false positives (above the threshold) or true negatives (below it). Fig. 4 (left) shows the
numbers of true positives and false positives detected in the 1000 realizations of these tests, as a function of the
SNR threshold.
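The threshold classification described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy SNR values are hypothetical, and a detection is counted as positive when its SNR is strictly above the threshold.

```python
def classify_counts(snr_injected, snr_speckle, threshold):
    """Count (TP, FN, FP, TN) for a given SNR threshold.

    snr_injected: SNRs measured where a fake source was injected (Test 1).
    snr_speckle:  SNRs measured at the same positions without injection (Test 2).
    """
    tp = sum(1 for s in snr_injected if s > threshold)
    fn = len(snr_injected) - tp
    fp = sum(1 for s in snr_speckle if s > threshold)
    tn = len(snr_speckle) - fp
    return tp, fn, fp, tn


def rates(snr_injected, snr_speckle, threshold):
    """Return (true positive rate, false positive rate) at the threshold."""
    tp, fn, fp, tn = classify_counts(snr_injected, snr_speckle, threshold)
    return tp / (tp + fn), fp / (fp + tn)


if __name__ == "__main__":
    injected = [0.5, 2.0, 4.0, 8.0, 12.0]  # toy values, not ALICE data
    speckle = [0.2, 0.8, 1.5, 2.5, 3.5]
    # Sweeping the threshold traces the ROC curve point by point.
    for th in (1.0, 3.0, 5.0, 10.0):
        print(th, rates(injected, speckle, th))
```

Each threshold yields one (false positive rate, true positive rate) point, so sweeping the threshold over the full SNR range traces out a ROC curve.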

The Receiver Operating Characteristic (ROC) curve of our candidate classification method is presented in
Fig. 4 (right). It shows the true positive rate (or sensitivity) of our method as a function of its false positive
rate (one minus the specificity) and can be used to choose the SNR threshold working point that optimizes our
candidate classification. In the contrast-separation regimes probed in this analysis, setting a detection threshold
at SNR = 3 typically gives a ~100% confidence level in our detections (with few false positives) and a completeness
of ~40% (with 60% of the real objects undetected).

Figure 3. Distributions in SNR of the fake candidates injected in NICMOS non-detection datasets (left), and of the speckle
field at the same position in the datasets without the fake candidates injected (right).

Figure 4. Statistical analysis of the global efficiency of the ALICE detection strategy. Left: Number of true positives (green
dashed line) and of false positives (red dotted line) as a function of the SNR threshold (SNR_th) above which a detection
is considered as a real candidate. Right: Receiver Operating Characteristic (ROC curve, defined by the true-positive
rate as a function of false-positive rate) of the ALICE candidate search strategy (purple line), compared with a completely
random detection strategy (red dotted line) and a perfectly efficient detection strategy (green dashed line). Results for
SNR thresholds of 1, 3, 5 and 10 are color-coded on both plots.

The ROC presented in Fig. 4 illustrates the performance of our systematic detection strategy, compared 
to a fully random detection classification (defined by equal rates of true positives and false positives). It also 
demonstrates the difficulty of finding all the candidates in a sample which is biased towards low-SNR point 
sources, generated using the distribution of our ALICE detections. 

3.2 Completeness per detection 

In this section we use the same statistical process to analyze our efficiency at detecting candidates at a given 
contrast and separation. 

For each ALICE detection identified with the F160W filter in the second era of NICMOS (sample presented
in Fig. 2, top row), we perform the following two tests with the 60 available non-detection datasets:

Test 1 The ALICE detection is injected in each image of the non-detection dataset, with its sky position
and contrast with respect to its host star preserved. The images are then re-processed, de-rotated, and combined,
and the SNR of the candidate is measured. The point source is then injected again in the same dataset at
the same separation and contrast, but this time with 135° added to its position angle, to add azimuthal
diversity to the analysis (in case the point source is aligned with a diffraction spike in some datasets), and
the process is repeated.

Test 2 The non-detection dataset is similarly processed, and the SNR of the speckle field is measured at the
same two positions where the ALICE candidate was injected in Test 1.

Figure 5. Statistical analysis of our efficiency at detecting a given candidate (here labeled as C19). Top-Left: Candidate
C19 as seen in a NICMOS reprocessed dataset, detected at 2.8” from the star with a brightness 15 magnitudes fainter
than the star. Top-Right: Distributions of the SNR of the detection measured when injecting C19 in 60 non-detection
datasets at 2 different position angles (blue histogram), and of the speckle field measured at the same location in the
datasets without injecting C19 (purple histogram). The wide range of SNR values in the former sample reflects the
difficulty of detecting C19 at different position angles in a NICMOS dataset (alignment with diffraction spikes in some roll
images, over-subtraction effects...). Bottom-Left: Number of true C19 positives and false C19 positives as a function of
the SNR threshold above which a detection is considered as a real candidate. Bottom-Right: ROC curve for C19, showing
our efficiency at detecting point sources with separations and contrasts similar to C19 in the NICMOS archive.

The true positive and false positive rates are estimated from the SNR distributions from both tests for each 
candidate, as a function of the SNR detection threshold. The results of these tests are presented as an example 
for one candidate (“C19”) in Fig. 5. 
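The two injection positions used in Test 1 can be sketched as pixel offsets from the star. This is an illustration under assumptions that are not stated in the text: the offsets are computed in a north-up frame with north along +y and east along -x, the position angle is measured from north through east, and the roll orientation of each raw image is ignored. The function name is hypothetical; the 0.076” pixel size and the 135° shift come from the text.

```python
import math


def injection_offsets(separation_arcsec, pa_deg, pa_shift_deg=135.0,
                      pixel_scale=0.076):
    """Return the (dx, dy) pixel offsets of the two injection positions.

    Assumed convention: north is +y, east is -x, and the position angle is
    measured from north through east, in a north-up frame.
    """
    sep_pix = separation_arcsec / pixel_scale
    offsets = []
    for pa in (pa_deg, pa_deg + pa_shift_deg):
        theta = math.radians(pa % 360.0)
        dx = -sep_pix * math.sin(theta)
        dy = sep_pix * math.cos(theta)
        offsets.append((dx, dy))
    return offsets


if __name__ == "__main__":
    # A source at 2.8 arcsec, like C19; the position angle here is arbitrary.
    for dx, dy in injection_offsets(2.8, 40.0):
        print(round(dx, 2), round(dy, 2))
```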



Figure 6. Statistical analysis of our efficiency at detecting each candidate. Left: ROC curve for each ALICE detection.
Right: Area Under the ROC Curve (AUC) as a function of the separation and contrast of our candidates, which shows
our efficiency at detecting a given candidate (see text for details). The green area shows the parameter space where we
are better than 95% efficient, both in completeness and confidence level for our detections.


The ROCs of all candidates are presented in Fig. 6 (left). The candidates that are easiest to detect (at large
separation and low contrast from the star) are systematically identified with Test 1, never mistaken for speckles
in Test 2, and the SNR histograms from the two tests are well separated. The ROCs of these solid candidates
pass close to the point of 100% true positive rate at 0% false positive rate, which is characteristic of a “perfect
binary classifier”. For such candidates, the area under the ROC curve (AUC) is close to 1, the maximum value. On the
other hand, candidates at small separation and high contrast are easily confused with speckles and present the
greatest challenge. Their SNR histograms from Test 1 and Test 2 largely overlap, and their ROC curves
are close to the line of equal false positive and true positive rates, characteristic of a “random classifier”
with an AUC value of 0.5.

For a point source at a given contrast and separation, the AUC value provides a metric that evaluates the
combination of our completeness (ability to detect all the true candidates in the sample) and our confidence level
(ability to keep false positives out of our sample) in our detections. Fig. 6 (right) presents the AUC values of
the ALICE detections as a function of their separation and contrast with respect to their host star. The green area
shows the part of this parameter space where we are the most efficient in our detection strategy, with AUC > 0.95.
At large separation (~3”), we are complete in this regime down to a contrast of delta magnitude ~14. At low
contrast (Δm ~ 10), we are complete down to a separation of ~0.8”.
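To make the two limiting AUC values concrete, the sketch below estimates the AUC directly from two SNR samples, using the standard equivalence between the area under the ROC curve and the probability that a randomly chosen injected source has a higher SNR than a randomly chosen speckle (ties counted as one half). This pairwise estimator is swapped in for illustration and is not necessarily how the AUC values in Fig. 6 were computed; the function name and toy values are hypothetical.

```python
def auc_from_samples(snr_injected, snr_speckle):
    """Estimate the area under the ROC curve from two SNR samples.

    Returns the fraction of (injected, speckle) pairs in which the injected
    source has the higher SNR, counting ties as one half.
    """
    if not snr_injected or not snr_speckle:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for s in snr_injected:
        for n in snr_speckle:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(snr_injected) * len(snr_speckle))


if __name__ == "__main__":
    # Well-separated SNR histograms: behaves like a perfect classifier.
    print(auc_from_samples([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]))
    # Identical SNR histograms: behaves like a random classifier.
    print(auc_from_samples([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

With the toy inputs above, the first call corresponds to the well-separated case (AUC of 1) and the second to the fully overlapping case (AUC of 0.5).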

4. CONCLUSION 

We presented a preliminary overview of the detections found as part of the ALICE project by re-analyzing the
NICMOS coronagraphic archive. We described the candidate search strategy that led to these detections, which
include point sources at low SNR, short separations, and high contrast. To prioritize these detections for future
follow-up observations, we conducted a statistical analysis that consists of injecting fake point sources with specific
characteristics in non-detection NICMOS datasets and estimating true positive and false positive rates from these
simulations. We found with this method that our detection strategy is complete with a high confidence level for
separations down to ~0.8” at low contrast levels, and for delta magnitudes down to ~14 at 3” separation from
the star.